ABSTRACT


Discovering anomalous users in a social network is a crucial problem in network analysis. Previous works focus on networks with just one type of interaction
among the entities. However, relationships among people are highly complex, and users have multiple types of interaction in a social network. Moreover,
users tend to form communities in a social network: normal users usually have friends who are friends themselves, whereas anomalous users typically do not
follow this rule. In this paper, we consider the detection of anomalous nodes in a multi-layer social network by combining the information in each layer of the network.
We propose a pioneering algorithm based on community detection that assigns an anomaly score to each user and ranks the users accordingly. Experimental results on real
datasets show that the proposed algorithm can recognize anomalous users in a multi-layer social network.
INTRODUCTION:

Nowadays, many complex systems are modeled as networks. Arab et al., (2014) and Aqib et al., (2018) state that a network is a set of nodes, representing entities,
connected by edges, representing the relationships between entities. For example, a social network can be modeled as a graph in which nodes represent users and
edges represent the relationships between users, following the definition of Diesner et al., (2005). A network that models one type of relation is called a single-layer network,
while multiple types of relations among entities are modeled as a multi-layer network. Most complex systems are modeled as multi-layer networks. Network analysis is a
powerful tool for analyzing data in many respects, including detecting communities Chunaey, P (2020), finding influential nodes Debnath et al., (2020), and recognizing abnormal
behavior in the network Song et al., (2019). In this research, we propose an algorithm to detect anomalous nodes in a multi-layer network.


Anomaly detection is a difficult task because there is no unique definition of an anomaly; the definition depends on the application and the problem at hand, according to
Bindu et al., (2017). Chandola et al., (2009) mention that an anomaly is an entity that behaves differently from other entities. The traditional definition, by Hawkins (1980),
is that an anomaly is an entity that deviates substantially from other entities. Based on the definition of Nanavati et al., (2008), an anomaly is an observation that is
inconsistent with other observations. Detecting anomalies is highly crucial because they can cause damage to the system. For instance, an anomaly in a social network may
represent fraudulent or illegal behavior that can harm other users.


Anomaly detection in networks differs from anomaly detection in non-network data because it must analyze the interactions among entities. Hence, it is a
crucial problem that has attracted researchers' attention in recent years. Most works address anomaly detection on a single-layer network; however,
most real systems are modeled as multi-layer networks according to Kivel (2014)- Kunpeng et al (2020). Most anomaly detection algorithms for multi-layer networks
convert the network into a single-layer network and apply anomaly detection algorithms developed for a single layer. Aggregating a multi-layer network into
a single layer loses hidden information in the network. Therefore, developing an algorithm to detect anomalies in a multi-layer network is an essential and ongoing
research area. In this research, the purpose of the algorithm is to identify anomalous nodes in a multi-layer network.


The proposed algorithm finds anomalous nodes in a multi-layer network by using a community detection concept. The algorithm uses the structure of each node's
egonent and super-egonent to compute the community of each node in the network. The egonent of a node is its one-step neighborhood: the node itself, all its neighbors, and
the interactions among them. The super-egonent is the one- and two-step neighborhood. The algorithm then calculates an anomaly score for each node based on the detected
communities. After that, the final anomaly score of each node is computed as a linear combination of the node's anomaly scores in the different layers. In this algorithm, the node
contribution concept is introduced as the linear coefficient. Node contribution expresses the degree of the node's importance in each layer. Finally, the anomaly scores are
ranked to recognize anomalous nodes in the network. The proposed algorithm is applied to six datasets, and the results show that it recognizes anomalous nodes effectively.
The organization of the paper is as follows. Section 2 discusses the previous research on anomaly detection. In section 3, the problem and the algorithm are presented.
Experimental analysis and results are presented in section 4. Finally, the paper concludes in section 5.


RELATED WORKS: 

Most of the research in this area addresses anomaly detection techniques for non-network data. Surveys of anomaly detection on non-network
data include Savage et al., (2016). In an overview of general anomaly detection, Chandola et al., (2009) group previous anomaly detection techniques
into six classes: classification, clustering, nearest neighbor, statistical, information-theoretic, and spectral analysis. Anomaly detection on network
data was introduced in a workshop held at ACM 2013 by Akoglu et al., (2010). In networked data, anomaly detection has been studied extensively: Akoglu
et al., (2010), Chandola et al., (2012), Gao et al., (2010), Hassanzadeh et al., (2012), Hassanzadeh et al., (2013a), Hassanzadeh et al., (2013b), Muller et al., (2013), Sun et
al., (2010), Sun et al., (2005), Tong et al., (2011), Xu et al., (2007), Yang et al., (2015), Aggarwal et al., (2011), Ji et al., (2013), and Miller et al., (2015). A recent survey
on anomaly detection techniques in social network data is presented by Bindu et al., (2016). These works can be divided into two classes, behavior-based and
structure-based, according to Hassanzadeh et al., (2012). Behavior-based methods consider users' behavior, and structure-based methods mine users' usage patterns, based on
Hassanzadeh et al., (2012). Also, Zoppi et al (2020) proposed a multi-layer anomaly detection framework for complex dynamic systems. A comprehensive novel model for a network speech
anomaly detection system using a deep learning approach was put forward by Manimaran et al (2020). Application-oriented research in the area includes Ullah et al., (2020),
Lia et al., (2019) and Zhang et al., (2019).


However, most works consider only one type of interaction in the network when applying anomaly detection. Also, most techniques, tools, and algorithms apply only to a
single-layer network and cannot be used directly on a multi-layer network, so it is necessary to combine the multi-layer network into a single layer. Doing so loses
hidden information in the network. There is limited work on anomaly detection in multi-layer networks. To the best of our knowledge, only one work addresses it, Bindu et
al., (2017). That work uses the network structure to detect anomalous nodes in a multi-layer network. So, it is highly essential to develop anomaly
detection for multi-layer networks. In this work, we propose an algorithm to detect anomalous nodes in a multi-layer network.


The Proposed Algorithm: 
In this section, we propose a new algorithm to detect anomalous nodes in a multi-layer network. Most previous research focuses on single-layer
networks. However, real networks have multiple types of interactions, and they cannot be modeled as a single layer without losing useful information. On the other hand, anomaly
detection in a complex network is a difficult problem because there is no universal definition of an anomaly; the definition is wholly tied to the problem at hand. So, developing an
algorithm or a tool for this task is a highly important and ongoing area of research.


In this work, we present an algorithm capable of detecting anomalous nodes in a multi-layer network. It includes three phases: computing communities,
computing anomaly scores, and ranking the anomaly scores. In the following section, we put forward the definition of the problem, and then we discuss all phases of the algorithm in
detail.


Problem Definition: 

The problem is defined as follows. Suppose a multi-layer network $G = \{G_1, G_2, \ldots, G_L\}$ with $L$ network layers, where each layer represents one type of interaction and is considered
a single-layer network $G_l = (V_l, E_l)$. Every node can exist in more than one layer, and every single-layer network has its own adjacency matrix, so that
$A = \{A_1, A_2, \ldots, A_L\}$, where $A_l(i, j) = 1$ if and only if nodes $i$ and $j$ are connected in layer $l$. The purpose of this research is to develop an algorithm to detect anomalous nodes in the
multi-layer network.
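As an illustration of this definition, the following minimal sketch stores each layer as a dictionary of adjacency sets and exposes the adjacency indicator for a pair of nodes. The layer names, node names, and the helper functions `build_layer` and `adjacency` are hypothetical and are not taken from the original implementation.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable
Layer = Dict[Node, Set[Node]]  # adjacency sets of one undirected layer


def build_layer(edges: Iterable[Tuple[Node, Node]]) -> Layer:
    """Build one undirected layer G_l = (V_l, E_l) from an edge list."""
    layer: Layer = {}
    for u, v in edges:
        layer.setdefault(u, set()).add(v)
        layer.setdefault(v, set()).add(u)
    return layer


def adjacency(layer: Layer, i: Node, j: Node) -> int:
    """Entry A_l(i, j): 1 if i and j are connected in this layer, else 0."""
    return 1 if j in layer.get(i, set()) else 0


# Hypothetical two-layer network; node "ann" exists in both layers.
network: Dict[str, Layer] = {
    "friendship": build_layer([("ann", "bob"), ("bob", "cat")]),
    "coauthor": build_layer([("ann", "dan")]),
}
```

Keeping one adjacency structure per layer, rather than a single merged graph, preserves the per-layer information that the later phases rely on.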


SOLUTION METHODOLOGY: 

In this section, we present our algorithm in detail. Our algorithm computes the community of each node in the individual layers. Then, an anomaly score is assigned to every node
of the network according to the communities of the nodes. The anomaly scores of the corresponding nodes from each layer are then combined, weighted by the node contribution
in each layer, to calculate the total anomaly score of a node in the multi-layer network. Finally, the nodes of the multi-layer network are ranked based on their anomaly
scores.


As there is no interdependency among the operations on the individual layers of the network, we compute the communities on different layers in parallel.
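Because the layers are processed independently, the per-layer work can be dispatched concurrently. The sketch below is one possible way to do this in Python; the helper `map_layers` and the toy layers are assumptions made for illustration, and the original implementation (in R) may be organized differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, TypeVar

LayerT = TypeVar("LayerT")
ResultT = TypeVar("ResultT")


def map_layers(
    layers: Dict[str, LayerT], per_layer: Callable[[LayerT], ResultT]
) -> Dict[str, ResultT]:
    """Apply per_layer to every layer concurrently, keyed by layer name."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(per_layer, layer) for name, layer in layers.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Toy usage: count the nodes of each (hypothetical) layer.
layers = {
    "friendship": {"a": {"b"}, "b": {"a"}},
    "coauthor": {"a": {"c"}, "c": {"a"}, "d": set()},
}
sizes = map_layers(layers, len)
```

For CPU-bound work in CPython, a process pool (`ProcessPoolExecutor`) would be the more likely choice; threads are used here only to keep the sketch short.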


Phase 1- Compute Community: 

Entities of most real systems tend to form communities based on their similarity and interests. Therefore, information from the community structure can be useful for
analyzing entities' behavior in order to detect anomalous nodes. In this work, an anomaly score is assigned to each node of the network in every layer based on community
detection. For this purpose, the following steps are taken.


1. Generate the super-egonent of each node in every layer of the network. For instance, for node $i$ with neighbors $i_1, i_2, \ldots, i_k$:
$\text{egonent}_i = \{i, i_1, i_2, \ldots, i_k\}$
$\text{super-egonent}_i = \{\text{egonent}_i, \text{egonent}(i_1), \ldots, \text{egonent}(i_k)\}$


2. Based on the Infomap community detection algorithm [], a community is assigned to each node of the super-egonent. For example, a community is assigned to every
node that exists in $\text{super-egonent}_i$.


3. For each node in every layer of the network, its neighbors are selected.

4. Two anomaly scores are assigned to each node in every layer: $SF_i^c$, based on the diversity of the communities of the node's neighbors, and $SS_i^c$, based on the
similarity of the node's community to its neighbors' communities.

If all neighbors of node $i$ belong to the same community, or node $i$ does not have any neighbors, $SF_i^c$ is equal to zero; $SF_i^c$
becomes larger as the variety of communities in the neighborhood increases. So, $SF_i^c$ is calculated as follows:

$$SF_i^c = \begin{cases} 0 & \text{all neighbors have the same community or no neighbors} \\ \dfrac{\text{number of different communities}}{\text{total nodes in egonent} - 1} & \text{otherwise} \end{cases}$$

$SS_i^c$ is calculated based on the similarity of the community of node $i$ with its neighbors' communities. If node $i$ does not have any neighbors, $SS_i^c$
is equal to zero. So, $SS_i^c$ is calculated as follows:

$$SS_i^c = \begin{cases} 0 & \text{no neighbors} \\ \dfrac{\text{number of neighbors with the same community as node } i}{\text{total nodes in egonent} - 1} & \text{otherwise} \end{cases}$$


5. Finally, the two anomaly scores are combined by the following equation:

$$S_i^c = a_1 SF_i^c + a_2 SS_i^c$$
where $a_1 = a_2$.
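The steps above can be sketched for a single layer as follows. This is a minimal illustration, not the authors' implementation: the community labels are supplied as a hypothetical dictionary instead of being computed with Infomap, and the equal weights are set to 0.5 as an assumption, since the text only states that the two weights are equal.

```python
from typing import Dict, Hashable, Set, Tuple

Node = Hashable
Layer = Dict[Node, Set[Node]]  # adjacency sets of one undirected layer


def egonent(layer: Layer, i: Node) -> Set[Node]:
    """Node i together with its one-step neighbors."""
    return {i} | layer.get(i, set())


def super_egonent(layer: Layer, i: Node) -> Set[Node]:
    """Union of the egonent of i and the egonents of its neighbors."""
    nodes = egonent(layer, i)
    for j in layer.get(i, set()):
        nodes |= egonent(layer, j)
    return nodes


def community_scores(
    layer: Layer,
    i: Node,
    community: Dict[Node, str],
    a1: float = 0.5,  # assumed value; the text only gives a1 = a2
    a2: float = 0.5,
) -> Tuple[float, float, float]:
    """Return (SF, SS, combined score S) of node i in one layer."""
    neighbors = layer.get(i, set())
    if not neighbors:
        return 0.0, 0.0, 0.0
    labels = [community[j] for j in neighbors]
    distinct = len(set(labels))
    denom = len(egonent(layer, i)) - 1  # equals the number of neighbors
    sf = 0.0 if distinct == 1 else distinct / denom
    ss = sum(1 for label in labels if label == community[i]) / denom
    return sf, ss, a1 * sf + a2 * ss


# Hypothetical layer: node 0 has four neighbors spread over three communities.
layer = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0, 5}, 5: {4}}
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "C", 5: "C"}
sf, ss, s = community_scores(layer, 0, community)
```

For node 0 in this toy layer, three distinct communities over four neighbors give SF = 0.75, two neighbors sharing community "A" give SS = 0.5, and the combined score is 0.625 under the assumed weights.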


Phase 2- Compute Anomaly Score: 

After computing the community-based anomaly scores, the final anomaly score of each node is calculated as a linear combination of its anomaly scores in the single-layer
networks. Since each relationship may have a different significance in the real world, assigning a different weight to each node in the different layers of the network is more
reliable. So, node contribution is defined to capture the importance of each node in an individual layer. Node contribution reflects how a specific node and
its neighbors are connected to each other and serves as the coefficient of the linear combination. If the connectivity of the node is dense in a specific layer,
its node contribution in that layer is high compared with layers in which the connectivity of the node is sparse. More formally, the node contribution for each layer is
defined as:


$$NC_i^l = \frac{d_i^l}{\sum_{k=1}^{L} d_i^k}$$

where $d_i^l$ is the degree of node $i$ in layer $l$.
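A small sketch of the node contribution follows. It assumes that the contribution of a node in a layer is its degree in that layer divided by the sum of its degrees over all layers, which is our reading of the definition; the function name `node_contribution`, the zero-degree fallback, and the toy layers are hypothetical.

```python
from typing import Dict, Hashable, Set

Node = Hashable
Layer = Dict[Node, Set[Node]]  # adjacency sets of one undirected layer


def node_contribution(layers: Dict[str, Layer], i: Node) -> Dict[str, float]:
    """Share of node i's total degree that falls in each layer."""
    degrees = {name: len(layer.get(i, set())) for name, layer in layers.items()}
    total = sum(degrees.values())
    if total == 0:
        # Assumption: an isolated node contributes nothing in any layer.
        return {name: 0.0 for name in layers}
    return {name: degree / total for name, degree in degrees.items()}


# Hypothetical node "a": degree 3 in layer "x" and degree 1 in layer "y".
layers = {
    "x": {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}},
    "y": {"a": {"b"}, "b": {"a"}},
}
contribution = node_contribution(layers, "a")
```

Under this reading, the contributions of a node with at least one edge sum to one across layers, so the final score is a weighted average of the per-layer scores.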
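The final anomaly score of each node is defined as follows:

$$\text{AnomalyScore}_i = \sum_{l=1}^{L} NC_i^l \times S_i^l$$

where $S_i^l$ is the community-based anomaly score $S_i^c$ of node $i$ computed in layer $l$.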
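As a worked illustration of this combination, the sketch below multiplies each per-layer score by the node contribution of that layer and sums over the layers. The function name and the numeric values are hypothetical.

```python
from typing import Dict


def final_anomaly_score(
    contribution: Dict[str, float], layer_score: Dict[str, float]
) -> float:
    """Sum over layers of node contribution times per-layer anomaly score."""
    return sum(contribution[name] * layer_score[name] for name in contribution)


# Hypothetical node with two layers.
contribution = {"x": 0.75, "y": 0.25}  # node contribution per layer
layer_score = {"x": 0.5, "y": 1.0}     # community-based score per layer
score = final_anomaly_score(contribution, layer_score)
```

Here the result is 0.75 × 0.5 + 0.25 × 1.0 = 0.625, so the denser layer "x" dominates the final score.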


Phase 3- Ranking Anomaly Score: 
In the previous phases, an anomaly score is computed for each node. After the anomaly scores are calculated, the nodes are ranked by their anomaly score.
In other words, the nodes are sorted by their degree of deviation from normal behavior.
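The ranking phase amounts to a sort. The sketch below assumes that a higher score means a more anomalous node and breaks ties by node identifier; both choices, as well as the function name `rank_nodes`, are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Optional, Tuple


def rank_nodes(
    scores: Dict[Hashable, float], top_k: Optional[int] = None
) -> List[Tuple[Hashable, float]]:
    """Sort nodes by descending anomaly score; ties are broken by node id."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
    return ranked[:top_k]  # top_k=None keeps the full ranking


# Hypothetical final anomaly scores.
scores = {"a": 0.2, "b": 0.9, "c": 0.9, "d": 0.1}
top_two = rank_nodes(scores, top_k=2)
```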
EXPERIMENTAL ANALYSIS AND DISCUSSION:

The effectiveness and efficiency of our algorithm are examined on six real multi-layer networks. We run the algorithm on a machine with an Intel Core i5 CPU @ 2.60 GHz
and 6 GB RAM running the Windows 8 operating system. The algorithm is implemented in the R programming language using the igraph library. As there is no
dependency among the operations on different layers, we implement all phases of our algorithm in parallel. First, the six datasets are introduced, and then the results of our
algorithm on these datasets are discussed.
1. Noordin Top Terrorist Network: The Noordin Top Terrorist Network, proposed by Roberts et al., (2011), is a four-layer network of Indonesian
terrorists. The dataset includes information on 78 terrorists regarding the operational, communication, trust, and financial relationships among them.


2. Social Evolution Dataset: The Social Evolution Dataset, proposed by Madan et al., (2012), is a five-layer network of students in an MIT dormitory.
The dataset includes information on student relationships such as close friend, socialize twice per week, political discussant, Facebook all tagged photos,
and blog live journal Twitter.


3. Aarhus: The Aarhus dataset, proposed by Magnani et al., (2013), is a five-layer network of social interactions in a research department of Aarhus University.
The dataset includes information about lunch, co-authorship, Facebook, work, and leisure.


4. DBLP_C: The DBLP_C dataset, proposed by Berlingerio et al., (2013), is a six-layer co-authorship network of computer science conferences.
Each layer of this network corresponds to a specific conference. Each node of the network is an author, and two authors are connected if they have a paper
together. The dataset includes information about VLDB, SIGMOD, CIKM, SIGKDD, ICDM, and SDM.


5. ArXiv: The arXiv dataset, proposed by De Domenico et al., (2015), is a 13-layer co-authorship network from the free scientific repository
arXiv. The dataset includes information about physics.soc-ph.data-an, physics, physics.bio-ph, math.OC, math-ph, cond-mat.stat-mech, cond-mat.dis-nn, q-bio,
nlin.AO, q-bio.BM, cs.SI, cs.CV.


6. GTD: The Global Terrorism Database (GTD), proposed by Berlingerio et al., (2011), is a 124-layer network of terrorist attack incidents in the
world. The nodes of this network are terrorist organizations, and two organizations are connected if they have attacked the same country in the same year. Also,
each layer represents one country. We use all terrorist attacks that occurred during the years 1970-2008.


The summary of each dataset is presented in the following table.

Dataset            Nodes     Edges     Layers
Noordin Top        78        911       4
Social Evolution   84        31,918    5
Aarhus             61        620       5
DBLP               6,771     19,345    6
arXiv              14,065    59,026    13
GTD                2,509     B29)      124

RESULTS AND DISCUSSION: 


The validation of anomaly detection algorithms is not simple because there is no labeled dataset [22, 67], and there is no standard method for the validation of anomaly 
detection algorithms [5, 14]. However, we evaluate the result of our algorithm on six real multi-layer networks. 


Moreover, most work on anomaly detection in network data is conducted on single-layer networks, and researchers convert a multi-layer network into a single layer
in order to apply algorithms developed for the single layer. As a baseline, we aggregate all information that exists in the different layers into a single-layer network, to which
traditional network analysis algorithms can be applied. The nodes of the aggregated network are all nodes in the multi-layer network, and its edges are all edges in the
multi-layer network. We also run the proposed algorithm on the aggregated network and refer to this baseline as the aggregation algorithm.
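The aggregation step can be sketched as a union of the per-layer adjacency sets. In this sketch, an edge that appears in several layers collapses into a single unweighted edge; whether the original baseline keeps such multiplicities is not stated, so this is an assumption, as are the function name `aggregate` and the toy layers.

```python
from typing import Dict, Hashable, Set

Node = Hashable
Layer = Dict[Node, Set[Node]]  # adjacency sets of one undirected layer


def aggregate(layers: Dict[str, Layer]) -> Layer:
    """Merge all layers into one single-layer network (union of nodes and edges)."""
    merged: Layer = {}
    for layer in layers.values():
        for node, neighbors in layer.items():
            merged.setdefault(node, set()).update(neighbors)
    return merged


# Hypothetical layers: the edge a-b appears in both and is kept once.
layers = {
    "x": {"a": {"b"}, "b": {"a"}},
    "y": {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()},
}
aggregated = aggregate(layers)
```

The merged structure has the same shape as a single layer, so the per-layer scoring can be applied to it unchanged, with the node contribution playing no role because there is only one layer.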


We compare our results with the aggregation algorithm. The top ten ranked nodes recognized by our algorithm and by the aggregation algorithm are shown in the following
table. Since there is no labeled data with ground truth, the top anomalous nodes are inspected manually to judge whether they are anomalous or not. In the Noordin
Top data, node 57 is a top anomalous node; it indicates Mohamed, the head of the terrorist group. All results are presented in the table below.

Dataset            Rank   Our Algorithm (Node)   Aggregation Algorithm (Node)
Noordin Top        1      57                     Syl
                   2      21                     21
                   3      70                     71
                   4      63                     23)
                   5      67                     66
                   6      43                     44
                   7      68                     68
                   8      4                      4
                   9      51                     al
                   10     We                     50
Social Evolution   1      15                     13
                   2      5a                     41
                   3      12                     8
                   4      11                     33)
                   5      82                     75
                   6      49                     Wl
                   7      67                     58
                   8      19                     Ney,
                   9      53)                    56
                   10     993                    2495
                   10     74                     82
Aarhus             1      7                      7
                   2      22                     1
                   3      10                     60
                   4      28                     2)
                   5      33                     34
                   6      44                     44
                   7      36                     26
                   8      20                     53
                   9      8                      2p
                   10     50                     61
DBLP               1      6815                   558
                   2      558                    4131
                   3      720                    720
                   4      6187                   534
                   5      4181                   1358
                   6      559)                   6818
                   7      6810                   14234
                   8      6178                   3525
                   9      3525)                  16216
                   10     6813                   233
ArXiv              1      125                    479
                   2      127                    83
                   3      8156                   218
                   4      468                    2)
                   5      10463                  54
                   6      10464                  715
                   7      3680                   578
                   8      10936                  80
                   9      10939                  1751
                   10     480                    842
GTD                1      406                    81
                   2      253                    161
                   3      241                    134
                   4      372                    1460
                   5      1998                   2245
                   6      2057                   109
                   7      1784                   ail
                   8      952                    836
                   9      2D)                    203
CONCLUSION: 


Detecting anomalous nodes in a multi-layer social network is a serious problem. Even though various techniques and algorithms have been proposed for single-layer
networks, there is limited work on multi-layer networks, and it remains an unexplored area of research.


In this research, we proposed an algorithm based on community detection to detect anomalous nodes in a multi-layer network. Since people tend to
form communities in a social network, a normal user belongs to the same community as its friends, whereas the neighbors of an anomalous node belong to different communities. We
use this concept to detect anomalous nodes in network data. We assign an anomaly score to each node in every layer based on the communities of its neighborhood,
and then combine the node's anomaly scores from the different layers, weighted by node contribution. Finally, the nodes are ranked by their
anomaly scores.


There is no standard technique to evaluate the algorithm, and our algorithm is a pioneering algorithm in this area. So, we validate our algorithm on six real datasets and
evaluate the results manually.